# Few-shot fine-tuning

**Swf Trained Model** (nagarajuthirupathi) · Apache-2.0 · Image Segmentation, Transformers · 132 downloads · 0 likes
An image segmentation model fine-tuned from mukesh3444/window_detection_model on the nagarajuthirupathi/indoor_window_detection_swf dataset, targeting indoor window detection.
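
If the checkpoint follows the standard Transformers setup, it can be tried through the image-segmentation pipeline. A minimal sketch, assuming the repo id is simply the author and model name joined (not confirmed by the listing):

```python
from transformers import pipeline

# Hypothetical repo id assembled from the listing (author/model name); verify before use.
segmenter = pipeline(
    "image-segmentation",
    model="nagarajuthirupathi/swf_trained_model",
)

# Run on a local photo of a room; each result carries a label, an optional score, and a PIL mask.
results = segmenter("indoor_room.jpg")
for r in results:
    score = round(r["score"], 3) if r["score"] is not None else None
    print(r["label"], score)
```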

**Plushies** (camenduru) · Openrail · Text-to-Image, English · 19 downloads · 22 likes
A text-to-image generation model built on the Flax framework, designed for generating plush-toy-style images.

**Segformer B0 Finetuned Morphpadver1 Hgo Coord** (NICOPOI-9) · Other · Image Segmentation, Transformers · 98 downloads · 0 likes
An image segmentation model fine-tuned from nvidia/mit-b0 that performs strongly on the NICOPOI-9/morphpad_coord_hgo_512_4class dataset.

**Light R1 32B DS** (qihoo360) · Apache-2.0 · Large Language Model, Transformers · 1,136 downloads · 13 likes
Light-R1-32B-DS is a near-SOTA 32B mathematical reasoning model fine-tuned from DeepSeek-R1-Distill-Qwen-32B, reaching high performance with only 3K SFT examples.
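
A minimal chat-style generation sketch for the math model above, assuming it uses the standard Transformers causal-LM interface and that the repo id is qihoo360/Light-R1-32B-DS (inferred from the listing, not verified). A 32B model needs substantial GPU memory or quantization:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id derived from the listing; confirm on the hub first.
model_id = "qihoo360/Light-R1-32B-DS"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

messages = [{"role": "user", "content": "Prove that the sum of two even integers is even."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```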

**Tunisian TTS** (amenIKh) · Speech Synthesis, Arabic · 48 downloads · 2 likes
An XTTS v2 text-to-speech model fine-tuned on a custom Tunisian dataset.

**Granite Timeseries Ttm R2** (ibm-granite) · Apache-2.0 · Climate Model · 217.99k downloads · 89 likes
TinyTimeMixers (TTMs) are compact pretrained models for multivariate time-series forecasting, open-sourced by IBM Research. Starting at roughly 1 million parameters, they introduced the concept of "miniature" pretrained models to time-series forecasting.

**Urdu Text To Speech Tts** (HamzaSidhu786) · MIT · Speech Synthesis, Transformers, Other · 46 downloads · 2 likes
An Urdu TTS model fine-tuned from microsoft/speecht5_tts on a small training set of 4,200 sentences; commercial use requires retraining.
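
Since the model is a microsoft/speecht5_tts fine-tune, the usual SpeechT5 recipe should apply. A sketch, with the repo id and the speaker embedding treated as assumptions (a real run would load an x-vector matching the fine-tuning speaker):

```python
import torch
from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech, SpeechT5HifiGan

# Assumed repo id built from the listing; check the actual model card.
model_id = "HamzaSidhu786/urdu-text-to-speech-tts"

processor = SpeechT5Processor.from_pretrained(model_id)
model = SpeechT5ForTextToSpeech.from_pretrained(model_id)
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

inputs = processor(text="...Urdu text here...", return_tensors="pt")

# SpeechT5 conditions on a 512-dim speaker embedding; a zero vector is only a placeholder.
speaker_embeddings = torch.zeros((1, 512))

speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
print(speech.shape)  # 1-D waveform tensor at 16 kHz
```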

**Speecht5 Base Cs Tts** (fav-kky) · Speech Synthesis, Transformers, Other · 66 downloads · 0 likes
A monolingual Czech SpeechT5 base model, pre-trained on 120,000 hours of Czech audio and a 17.5-billion-word text corpus, designed as a starting point for Czech TTS fine-tuning.

**Florence 2 DocVQA** (HuggingFaceM4) · Text-to-Image, Transformers · 3,096 downloads · 60 likes
A version of Microsoft's Florence-2 model fine-tuned for one day on 5% of the Docmatix dataset with a learning rate of 1e-6.
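
Florence-2 fine-tunes are typically driven by task-prefix prompts through the model's custom code. A hedged sketch, assuming the repo id from the listing and a `<DocVQA>` prefix; both should be checked against the model card:

```python
from PIL import Image
from transformers import AutoProcessor, AutoModelForCausalLM

# Florence-2 checkpoints ship custom modeling code, hence trust_remote_code=True.
# Repo id taken from the listing; treat it as an assumption.
model_id = "HuggingFaceM4/Florence-2-DocVQA"

processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("invoice_page.png").convert("RGB")
prompt = "<DocVQA>What is the total amount due?"  # assumed task prefix for this fine-tune

inputs = processor(text=prompt, images=image, return_tensors="pt")
generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=64,
)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```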

**Kosmos 2 PokemonCards Trl Merged** (Mit1208) · Image-to-Text, Transformers, English · 51 downloads · 1 like
A multimodal model fine-tuned from Microsoft's Kosmos-2, specifically trained to read Pokemon names from Pokemon cards.
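
Kosmos-2 checkpoints are usually loaded with AutoProcessor and AutoModelForVision2Seq. A sketch under the assumption that this fine-tune keeps the base Kosmos-2 interface; the repo id and prompt below are illustrative guesses:

```python
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

# Assumed repo id from the listing; verify before use.
model_id = "Mit1208/Kosmos-2-PokemonCards-trl-merged"

processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id)

image = Image.open("pokemon_card.jpg").convert("RGB")
prompt = "Pokemon name:"  # hypothetical prompt; the training prompt may differ

inputs = processor(text=prompt, images=image, return_tensors="pt")
generated_ids = model.generate(
    pixel_values=inputs["pixel_values"],
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    image_embeds_position_mask=inputs["image_embeds_position_mask"],
    max_new_tokens=32,
)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```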

**Llama 3 8b Patent Small Dataset** (kimhyeongjun) · Other · Large Language Model, Transformers, English · 17 downloads · 4 likes
A model fine-tuned from Meta-Llama-3-8B-Instruct on a small dataset of 16,000 English translations of Korean patents, intended for testing purposes only.

**Gemma 1.1 7b It Fictional Chinese V1** (yzhuang) · Large Language Model, Transformers · 21 downloads · 1 like
A Chinese-language model fine-tuned from google/gemma-1.1-7b-it on the generator dataset.

**Videomae Base Finetuned Subset** (Joy28) · Video Processing, Transformers · 2 downloads · 0 likes
A video understanding model fine-tuned from MCG-NJU/videomae-base on an unspecified dataset, reaching 67.13% accuracy.

**Mms Spa Finetuned Colombian Monospeaker** (ylacombe) · Speech Synthesis, Transformers, Spanish · 71 downloads · 1 like
A Spanish TTS model based on MMS and the VITS architecture, fine-tuned with only 80-150 samples and about 20 minutes of training to produce Spanish speech with a Colombian accent.

**Mms Spa Finetuned Argentinian Monospeaker** (ylacombe) · Speech Synthesis, Transformers, Spanish · 88 downloads · 3 likes
A fine-tuned version of the MMS Spanish model, built on the VITS architecture and trained with only 80 to 150 samples in roughly 20 minutes, producing Spanish speech with an Argentinian accent.
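
Both MMS Spanish fine-tunes above use the VITS architecture, so the standard MMS-TTS loading recipe should work. A sketch with an assumed repo id (shown for the Colombian variant):

```python
import torch
from transformers import VitsModel, AutoTokenizer

# Assumed repo id from the listing; verify on the hub.
model_id = "ylacombe/mms-spa-finetuned-colombian-monospeaker"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = VitsModel.from_pretrained(model_id)

inputs = tokenizer("Hola, ¿cómo estás?", return_tensors="pt")
with torch.no_grad():
    waveform = model(**inputs).waveform  # (batch, samples) at model.config.sampling_rate
print(waveform.shape, model.config.sampling_rate)
```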

**Distil Ast Audioset Finetuned Cry** (jstoone) · Apache-2.0 · Audio Classification, Transformers · 76 downloads · 1 like
An audio classification model fine-tuned from bookbot/distil-ast-audioset on the DonateACry dataset, designed to detect infant cries.
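
If the checkpoint exposes a standard audio-classification head, the pipeline API is enough to try it. A sketch with an assumed repo id:

```python
from transformers import pipeline

# Assumed repo id from the listing; confirm before use.
classifier = pipeline(
    "audio-classification",
    model="jstoone/distil-ast-audioset-finetuned-cry",
)

# Accepts a path to an audio file (or a raw numpy waveform); returns labels with scores.
for prediction in classifier("baby_monitor_clip.wav", top_k=3):
    print(prediction["label"], round(prediction["score"], 3))
```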

**Abap Nous Hermes** (smjain) · Apache-2.0 · Large Language Model, Transformers, English · 51 downloads · 1 like
An ABAP programming-language model fine-tuned from Llama-2-7b-chat-hf, specifically aimed at generating ABAP code.

**Segformer Finetuned Ihc** (Isaacks) · Other · Image Segmentation, Transformers · 14 downloads · 0 likes
An image segmentation model fine-tuned from nvidia/mit-b0 on the Isaacks/ihc_slide_tissue dataset.

**Digit Mask Unispeech Sat Base Ft** (mazkooleg) · Speech Recognition, Transformers · 25 downloads · 0 likes
A speech processing model fine-tuned from microsoft/unispeech-sat-base, specialized for digit-masking tasks and reporting strong results on its evaluation set.

**Swin Tiny Patch4 Window7 224 Finetuned Eurosat** (fyztkr) · Apache-2.0 · Image Classification, Transformers · 19 downloads · 0 likes
A vision model fine-tuned from microsoft/swin-tiny-patch4-window7-224 on an image-folder dataset.
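
The Swin, ViT, ConvNeXt, DeiT, and BEiT classifiers in the rest of this list all share the same loading recipe. A sketch using the image-classification pipeline, with the repo id guessed from this entry:

```python
from transformers import pipeline

# Assumed repo id (author/model name from the listing); swap in any of the classifiers below.
classifier = pipeline(
    "image-classification",
    model="fyztkr/swin-tiny-patch4-window7-224-finetuned-eurosat",
)

# Returns the top-k labels with softmax scores for a local image.
for prediction in classifier("satellite_tile.png", top_k=3):
    print(prediction["label"], round(prediction["score"], 3))
```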

**Vit Base Railspace** (Kaspar) · Apache-2.0 · Image Classification, Transformers · 18 downloads · 2 likes
A Vision Transformer model fine-tuned from google/vit-base-patch16-224-in21k, achieving 99.26% accuracy on its evaluation set.

**Donut Base Finetuned Latvian Receipts V2** (Inesence) · MIT · Text Recognition, Transformers · 13 downloads · 0 likes
A model based on the Donut architecture, fine-tuned specifically for Latvian receipt data.

**Donut Base Finetuned Latvian Receipts** (Inesence) · MIT · Text Recognition, Transformers · 31 downloads · 0 likes
A fine-tuned version of donut-base on a Latvian receipt dataset, primarily used for receipt image processing tasks.
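
Donut fine-tunes are driven by a task-specific start prompt and decoded into JSON. A sketch that should apply to both Latvian receipt models above, with the repo id and the `<s_receipt>` prompt marked as assumptions:

```python
from PIL import Image
from transformers import DonutProcessor, VisionEncoderDecoderModel

# Assumed repo id from the listing; the real start token is defined by the fine-tune.
model_id = "Inesence/donut-base-finetuned-latvian-receipts"

processor = DonutProcessor.from_pretrained(model_id)
model = VisionEncoderDecoderModel.from_pretrained(model_id)

image = Image.open("receipt.jpg").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values

task_prompt = "<s_receipt>"  # hypothetical; check the model card for the actual prompt token
decoder_input_ids = processor.tokenizer(
    task_prompt, add_special_tokens=False, return_tensors="pt"
).input_ids

outputs = model.generate(
    pixel_values,
    decoder_input_ids=decoder_input_ids,
    max_length=model.decoder.config.max_position_embeddings,
    pad_token_id=processor.tokenizer.pad_token_id,
    eos_token_id=processor.tokenizer.eos_token_id,
    use_cache=True,
)
# token2json turns the generated tag sequence into a nested dict of receipt fields.
print(processor.token2json(processor.batch_decode(outputs)[0]))
```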

**Deit Tiny Patch16 224 Finetuned Og Dataset 10e** (Gokulapriyan) · Apache-2.0 · Image Classification, Transformers · 17 downloads · 0 likes
A lightweight image classification model based on the DeiT-tiny architecture, reaching 94.8% accuracy after fine-tuning on a custom image dataset.

**Whisper Medium Catalan** (shields) · Apache-2.0 · Speech Recognition, Transformers, Other · 19 downloads · 2 likes
A speech recognition model fine-tuned from OpenAI's Whisper Medium on the Catalan Common Voice 11.0 dataset.
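
As a Whisper fine-tune, the model can be exercised with the automatic-speech-recognition pipeline. A sketch with an assumed repo id:

```python
from transformers import pipeline

# Assumed repo id from the listing; confirm before use.
asr = pipeline(
    "automatic-speech-recognition",
    model="shields/whisper-medium-catalan",
    chunk_length_s=30,  # long-form audio is split into 30-second chunks
)

print(asr("catalan_interview.mp3")["text"])
```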

**Beit Base Patch16 224 Pt22k Ft22k Finetuned FER2013CKPlus 7e 05 Finetuned SFEW 7e 05** (lixiqi) · Apache-2.0 · Image Classification, Transformers · 17 downloads · 0 likes
A BEiT-based vision Transformer fine-tuned on the FER2013CKPlus and SFEW datasets for facial expression recognition.

**Vit Base Patch16 224 In21k Lcbsi** (polejowska) · Apache-2.0 · Image Classification, Transformers · 33 downloads · 0 likes
A fine-tuned model based on Google's Vision Transformer (ViT) architecture, suitable for image classification tasks.

**Swin Tiny Patch4 Window7 224 Finetuned Eurosat** (QIANWEI) · Apache-2.0 · Image Classification, Transformers · 11 downloads · 0 likes
A tiny Swin Transformer image classification model fine-tuned on the EuroSAT dataset, suitable for remote-sensing image classification.

**Convnext Small 224 Leicester Binary** (davanstrien) · Apache-2.0 · Image Classification, Transformers · 13 downloads · 0 likes
An image classification model fine-tuned from facebook/convnext-small-224 for a binary classification task, achieving an F1 score of 0.9620 on its evaluation set.

**Convnext Tiny 224 Finetuned On Unlabelled IA With Snorkel Labels** (ImageIN) · Image Classification, Transformers · 14 downloads · 0 likes
A ConvNeXt-Tiny computer vision model fine-tuned on unlabelled data using pseudo-labels generated with Snorkel.

**Bart Base Few Shot K 128 Finetuned Squad Seed 4** (anas-awadalla) · Apache-2.0 · Question Answering System, Transformers · 13 downloads · 0 likes
A BART-base question-answering model fine-tuned on SQuAD in a few-shot setting (K = 128 examples), suitable for reading comprehension tasks.
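
A quick way to probe the few-shot SQuAD model is the question-answering pipeline. A sketch with an assumed repo id; the context string is made up for illustration:

```python
from transformers import pipeline

# Assumed repo id from the listing; confirm before use.
qa = pipeline(
    "question-answering",
    model="anas-awadalla/bart-base-few-shot-k-128-finetuned-squad-seed-4",
)

result = qa(
    question="What does the few-shot K value refer to?",
    context="In these runs, K denotes the number of labelled SQuAD examples used for fine-tuning.",
)
print(result["answer"], round(result["score"], 3))
```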

**Convnext Tiny 224 Finetuned** (ImageIN) · Apache-2.0 · Image Classification, Transformers · 15 downloads · 0 likes
A fine-tuned version of facebook/convnext-tiny-224, primarily used for image classification and reporting strong results on its evaluation set.

**Vit Base Patch16 224 In21k Wwwwii** (Imene) · Apache-2.0 · Image Classification, Transformers · 21 downloads · 0 likes
A vision classification model fine-tuned from Google's Vision Transformer (ViT) base model, suitable for image classification tasks.

**Swin Tiny Patch4 Window7 224 Finetuned Eurosat Kornia** (nielsr) · Apache-2.0 · Image Classification, Transformers · 16 downloads · 0 likes
A fine-tuned Swin Transformer image classification model achieving 98.3% accuracy on an image-folder dataset.

**Swin Tiny Patch4 Window7 224 Finetuned Eurosat** (Chandanab) · Apache-2.0 · Image Classification, Transformers · 13 downloads · 0 likes
A fine-tuned Swin Transformer image classification model achieving 93.94% accuracy on the EuroSAT dataset.

**Swin Tiny Patch4 Window7 224 Finetuned Eurosat** (HekmatTaherinejad) · Apache-2.0 · Image Classification, Transformers · 16 downloads · 0 likes
A fine-tuned Swin Transformer image classification model achieving 98% accuracy on the EuroSAT dataset.

**Swin Tiny Patch4 Window7 224 Finetuned Eurosat** (aricibo) · Apache-2.0 · Image Classification, Transformers · 15 downloads · 0 likes
A fine-tuned Swin Transformer Tiny image classification model achieving 97.26% accuracy on the EuroSAT dataset.

**Beit Finetuned** (jadohu) · Apache-2.0 · Image Classification, Transformers · 24 downloads · 1 like
A BEiT base model fine-tuned on CIFAR-10 for image classification, achieving 99.18% accuracy on its evaluation set.

**Swin Tiny Patch4 Window7 224 Finetuned Eurosat** (jemole) · Apache-2.0 · Image Classification, Transformers · 14 downloads · 0 likes
A fine-tuned model based on the Swin Transformer Tiny architecture for image classification, achieving 97.59% accuracy on its evaluation set.

**Swin Tiny Patch4 Window7 224 Finetuned Eurosat** (nielsr) · Apache-2.0 · Image Classification, Transformers · 51 downloads · 3 likes
A fine-tuned Swin Transformer image classification model achieving 97.44% accuracy on an image-folder dataset.